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Abstract 

In most biological studies and processes, cell proliferation and popula- 
tion dynamics play an essential role. Due to this ubiquity, a multitude of 
mathematical models has been developed to describe these processes. While 
the simplest models only consider the size of the overall populations, others 
take division numbers and labeling of the cells into account. In this work, 
we present a modeling and computational framework for proliferating cell 
population undergoing symmetric cell division. In contrast to existing mod- 
els, the proposed model incorporates both, the discrete age structure and 
continuous label dynamics. Thus, it allows for the consideration of division 
number dependent parameters as well as the direct comparison of the model 
prediction with labeling experiments, e.g., performed with Carboxyfiuores- 
cein succinimidyl ester (CFSE). Wc prove that under mild assumptions the 
resulting system of coupled partial differential equations (PDEs) can be de- 
composed into a system of ordinary differential equations (ODEs) and a 
set of decoupled PDEs, which reduces the computational effort drastically. 
Furthermore, the PDEs are solved analytically and the ODE system is trun- 
cated, which allows for the prediction of the label distribution of complex 
systems using a low-dimensional system of ODEs. In addition to modeling 
of labeling dynamics, we link the label-induced fluorescence to the measure 
fluorescence which includes autofluorescence. For the resulting numerically 
challenging convolution integral, we provide an analytical approximation. 
This is illustrated by modeling and simulating a proliferating population 
with division number dependent proliferation rate. 

Keyword: Proliferating population, label-structured population, flow cytom- 
etry, CFSE, division-structured population 



1 Introduction 



Cell proliferation is a central aspect of most biological processes, among others 
bacterial growth [H [2], immune response [21 HI [3], stem cell induced tissue re- 
modehng [6l [7] , and cancer progression [8] . Depending on the biological process, 
cellular proliferation has different characteristics. Cell division can be symmet- 
ric or asymmetric, and the daughter cells may or may not inherit the age of the 
mother cells (see Figure [T]). While many micro-organisms, such as the budding 
yeast, grow a daughter cell j9j which does not inherit the age of the mother cell 
and is born young, in most multicellular organism the mother cell divides sym- 
metrically into two daughter cells which inherit the age of the mother cell jlOj . 
The latter proliferation type - which will be the focus of this work - results in an 
accumulation of DNA damage and telomere shortening, which may be interpreted 
as aging of the individual cell. This results in a reduced proliferation potential, a 
reduced proliferation speed and finally in cell cycle arrest [101 HH IE] , known as 
senescence [13]. This has been discovered by Hayflick [TH] in the 1960s and the 
upper limit for the number of cell divisions a normal cell can undergo has been 
termed Hayflick limit. 

A variety of approaches are employed to investigate proliferation, ranging from 
the analysis of the cell cycle [2] to the model-based study of population hetero- 
geneity and subpopulations [6|. Nowadays, for human cell lines especially label- 
based proliferation assays are used to analyze the proliferation dynamics of cell 
populations. Common labels are Bromodeoxyuridine (BrdU) [15| and Carboxyfiu- 
orescein succinimidyl ester (CFSE) \TU\, while mainly the latter is used in recent 
studies. 

CFSE is a fluorescent cell staining dye which stays in cells for a long time and 
is distributed at cell division approximately equally among daughter cells. Thus, 
the proliferation of labeled cells results in a progressive dilution of the dye |17], 
as depicted in Figure [2| and quantitative information about the proliferation dy- 
namics can be gathered using flow cytometry [IB]. To determine the proliferation 
properties of cells, e.g., the rates of cell division and of cell death, from these data, 
analysis tools are required. The first proposed approaches employ peak detection 
and devolution [5l [HI [19] . Unfortunately, these methods are only applicable if the 
modes, corresponding to cells with a common division number, are well separated 
and if the data are not strongly noise corrupted. To overcome these limitations, 
different model-based approaches have been introduced. 

In the literature, mainly three different classes of population models are de- 
scribed: exponential growth models, division-structured population models and 
label-structured population models. The exponential growth models (EGM) are 
the simplest ones, and merely describe the number of individuals in a cell popu- 
lation. For this task a one-dimensional ODE, like the Gompertz equation [T], is 
sufficient. While exponential growth models allow the description of the prolifer- 
ation of many bacterial populations, they are in general not capable of describing 
the dynamics of human tissue cells. One reason for this is that the cell divi- 
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(a) Mother cells undergoing symmetric cell division split up into almost identical 
daughter cells. The difference of daughter cells is merely caused by the stochasticity of 
DNA replication, resulting in genetic and epigenetic differences, and the stochasticity 
of cell partitioning, yielding different protein abundances. 




(b) Mother cells undergoing asymmetric cell division yield daughter cells with different 
cell fates. In the most extreme case, the mother cells grows a daughter cell - this is 
called budding. The daughter cell is genetically identical to the mother cell but is 
born young, meaning that it does not inherit the age of the mother cell. 

Figure 1: Illustration of symmetric and asymmetric cell division. The color of the 
cells indicates the cell's age. In case of symmetric cell division the age strongly 
correlates with the number of divisions a cell has undergone up to this point. 

sion and cell death rates are found for many cell systems ^0\, e.g., B cells 
T cells osteoblasts [20], to depend on the division number. To capture these 
effects, a multitude of division-structured population models (DSP) has been in- 
troduced [ll[T9l|211l22l|23l|2l[25l|26l|271. The state variables of these models 
describe the sizes of the subpopulations, which are defined by a common divi- 
sion number. Hence, these models allow for the consideration of division number 
dependent properties. Still, these models do not provide information about the 
label concentrations and thus cannot be compared to data directly but require 
complicated and error-prone data processing. 

To avoid this, label-structured population models (LSP) are employed pS] . 
These models describe the evolution of the population density on the basis of 
a one-dimensional hyperbolic PDE. Hence, they provide predictions for the la- 
bel distributions at the individual time points and may be fitted to data di- 
rectly [T71 12HI l29l [30]. This renders complex data processing redundant and sim- 
plifies the model-data comparison. Still, these models do not allow for a direct 
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consideration of division number dependent parameters. To partly circumvent 
this problem, complex dependencies of the cell division and cell death rate on time 
and label concentration are introduced [29]. These are neither intuitive nor easy 
to interpret. Furthermore, the simulation of label-structured population models 
is computationally demanding and requires discretization, entailing further prob- 
lems. 

In the following a model is presented and analyzed which combines the division- 
structured population models and the label-structured population models and 
thereby overcomes their individual shortcomings. This population model, which 
we termed division- and label-structured population model (DLSP), is based on 
our own work p^. The same model has later also been used in [32| [33 ] for param- 
eter estimation, including slight modification. Here we provide the first rigorous 
in-depth assessment its properties. 

The DLSP model is introduced in Section |2] and incorporates both aspects: 
Discrete changes of the cell division number due to cell divisions and continuous 
dynamics of the label distribution. The overall model is a system of coupled par- 
tial differential equations. We discuss how this system of PDEs can be split up 
into two decoupled parts in Section |3| namely a single PDE and a set of ODEs, 
which significantly simplifies the solution. The obtained model is reduced further 
by truncation of the state space. This truncation and the resulting truncation 
error can be controlled using the a priori error bound which we derive. As the pro- 
posed model unifies the existing models, we outline the relations of the models in 
Section |4} In Section [6| the method is employed to study a population model with 
division number dependent division rates and an analysis of the computational 
complexity of the model is performed. The paper is concluded in Section [7j 

2 Modeling division- and label-structured popu- 
lations 

As outlined above, the study of proliferation dynamics in cell populations using 
labeling methods requires the consideration of two important distinct features: 

■ the label concentration x and 

■ the number of cell divisions i a cell has undergone. 

The importance of the label concentration x G IR+ (with ]R_|_ := [0, oo)) arises from 
the fact that this is the quantity which can be observed, e.g., using flow cytometry 
or microscopy [18] . On the other hand, a direct observation of the number of cell 
divisions i G Nq a cell has undergone is in general not possible, though the division 
number often plays a crucial role within the model. A cell which has divided once 
is expected to have different properties, e.g. a different division rate, than a cell 
which has already divided several dozen times [101 120] . 

In this paper we propose a model which captures both features of cells, distinct 
division numbers as well as distinct label concentrations among cells. Therefore, 
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Figure 2: Illustration of label dilution due to cell division. The division process 
results in halving of the concentration at each cell division (top), and the label 
intensity distribution within the cell populations (bottom). The latter one is ac- 
cessible, e.g., via labeling with CFSE. 



instead of a single PDE model describing the label dynamics of the overall pop- 
ulation, a PDE model is defined for every subpopulation. Thereby, the ith sub- 
population contains the cells which have divided i times. Cell division generates 
a flux from subpopulation i to subpopulation i + 1, thus inducing coupling. The 
system of coupled PDEs is given by 

+ 27ai_i(t)Ari_i(t,7x), 

with initial conditions 

i = : A^o(0, x) = Nq^q{x), Vi > 1 : Ni{Q, x) = 0. 

In this system, Ni{t,x) : IR+ x ]R_|_ — )• ]R+ denotes the label density in the ith 
subpopulation at time t. The structure of the models for the individual subpop- 
ulations is highly similar to a single PDE which is employed in label-structured 
models [28]. The fluxes influencing the label distribution Ni{t,x) are: 
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■ d{v{t, x)Ni{t, x))/dx, decay of label x in each cell with label loss rate x). 

m — {ai{t) + f3i(t)) Ni(t,x), disappearance of cells from the ith subpopulation 
due to cell division with rate ai{t) and due to cell death with rate (3i{t). 

m 2'~fai-i{t)Ni_i{t,'~fx), appearance of two cells due to cell division in the {i — 
l)th subpopulation with division rate The factor 7 G (1,2] is the 

rate of label dilution due to cell division (cf. [29l fT7]). 

It has to be emphasized that the division rates ai{t) : IR+ — t- ]R_|_ as well as the 
death rates /3j(t) : IR+ — )■ IR+ may depend on division number i and time t. To 
ensure existence and uniqueness of the solutions we require ai(t),/3i(t) G C^. As it 
is assumed that the labeling does not affect cell function, we do not allow and 
Pi to depend on the label concentration x. Furthermore, only label loss rates are 
considered which follow a linear degradation 

u{x) = -k{t)x. (2) 

The time dependence of the degradation rate may be arbitrary, but mainly con- 
stant degradation processes [2HI IIZI EH], k{t) = k (const.), or Gompertz decay 
processes, k{t) = cie"'^^*[30], are used. 

Note that by construction model ([T]) provides information about cell numbers 
and label density for the overall as well as for individual subpopulations. Hence, 
it combines advantages of common ODE models [H EH 122] and common PDE 
models [T71 |2S1 121] of cell populations and permits for more biologically plausible 
degrees of freedom than both of them. In detail, the available information are: 

Number of cells in the subpopulations: Given Ni{t,x), the number of cells 
contained in the ith subpopulation can be computed as 

iV,(t)= [ N,{t,x)dx. (3) 

This number of cells may help to understand the relative contribution of subpop- 
ulations to the overall population. 

Normalized label density in the subpopulations: Given Ni(t, x), the label density 
within the ith subpopulation can be computed as 

, ^Hi^ foriV,(t)>0 
n,{t,x)={ Ni{t) (4) 

otherwise. 

The normalized label density provides the probability of finding a cell within the 
zth subpopulation with label concentration ^ G + Ax], 

px+Ax 

Prob(^ G [x, X + Ax]) = / ni{t,x)dx. (5) 

J X 
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Besides the properties of the subpopulations, the model permits also the anal- 
ysis of the properties of the overall population. The unnormalized label density in 
the overall cell population M(t, x) : IR+ x IR+ — > IR+ is given by 

M{t,x) = ^Ni{t,x). (6) 
Prom M{t, x) the overall population size 

POO 

M{t)= I M{t,x)dx = ^N,{t) (7) 
•^0 *eNo 



and the normalized label density in the overall population 

for M{t) > 



m{t,x) = { M{t) EieNo^^W 

otherwise 

can be derived. These are the two experimentally observable variables of the 
system. The overall population size M{t) can be determined by cell counting, while 
the population density m(t, x) can be assessed by the labehng with CFSE or BrdU. 
By combining these two, M{t, x) can be reconstructed. As there is currently no 
direct cell division marker available, experimental assessment of the subpopulation 
sizes or of the label distribution within the subpopulations is in general not feasible. 
All common experimental techniques only provide the marginalization over the 
division number i [TTl [28l 129] . 

3 Analysis of division- and label-structured pop- 
ulation model 

Besides the advantages the DLSP model offers, its potential drawback is its com- 
plexity. The model is a system of coupled PDEs, which are in general difficult 
to analyze, and their simulation is often computationally demanding or even in- 
tractable. In the following it is shown that these problems can be solved for the 
DLSP model Q. The approach presented allows to efficiently compute the solu- 
tion of the DLSP model, without solving a system of coupled PDEs. 

3.1 Solution of the DLSP via decomposition 

In order to provide an efficient method for computing the solution of ([T]), we define 
the initial number of cells 

No,o= / No^o{x)dx (9) 
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and the initial label density 

f foriVoo>0 
no,o(x) = <^ iVo.o • (10) 

otherwise, 

according to ^ and Q. Given these definitions the following theorem holds: 
Theorem 1. The solution of model ([T]) is 

Vi: Ni{t,x) = Ni{t)ni{t,x) (11) 

in which: 

(i) Ni{t) is the solution of the system of the ODE: 
^ = 0: ^ = -(ao(t) + /9o(t))iVo, 



(12) 



V« > 1 : ^ = - Mt) + (3i{t)) Ni + 2a,„i(t)iV,_i 

with initial conditions: Nq{0) = Nq^q and Vz > 1 : iVj(O) = 0. 
(a) ni{t,x) is the solution of the PDF: 

duiit.x) d{xni{t,x)) 
v.. ^^-Mt)^^ = (13) 

with initial conditions Vi : nj(0,x) = 7*^o,o(7*3;). 

The state variables Ni{t) and ni{t,x) of the ODE system and the PDEs corre- 
spond to the number of cells ([s]) and the label density ^ in the ith. subpopulation, 
respectively. 

Proof. To prove that Theorem [l] holds, (|lT|) - g are inserted in ([T]) and it is 
shown that the resulting equation holds. The proof is only shown for i > 1, since 
the case i = can be treated analogously. Furthermore, for notational simplicity 
the dependence of ni(t, x), ai(t), and f3i{t) on t and x is omitted where not required. 
Inserting ( [lT| in ([T]) for i > 1 yields 

-k ^ = -(«i + A)iV,n,(t,x) + 27ai_iiVi_ini_i(t,7x). (14) 

The left hand side of this equation can be reformulated: 

dNiUi , dixNiUi) dNi - drii , - dixnA 
— ^ - k ^ ' ' = — -Hi + N~ - kNi \ ' 
ot ox dt ot ox ,^ p.x 

dt ^ \ dt dx J dt 



8 



(16) 



By inserting this resuh in (14) and substituting dNi/dt with (12), we obtain 

(- (ai + A.) + 2ai_iiVi_i) mit, x) = 

- {ai + f3i) Nini(t,x) + 2-fai_iNi_ini_i(t,jx), 

which can be simphfied to 

ni(t, x) = jrii^iit, 7x). (17) 

It can be proven that this last equality holds, e.g., by using the analytical solution 
of (13), which can be found below. This yields that (17) holds which concludes 
the proof of Theorem [1} □ 



Remark 1. Note that it can be verified that (17) holds if and only if the label loss 
rate //(t, x) is linear in x. 

With Theorem [ij the original system of coupled PDEs can be decomposed into 
a system of ODEs ((T2|) and a set of decoupled PDEs (13). This means that the 



size of the individual subpopulations can be decoupled from the label dynamics. 
This already tremendously simplifies the analysis, but a further simplification is 
possible: 

Corollary 1. The solution of model ([T]) is 

Wt : Niit,x) = iV,(t)ye-/o'=M'^-no,o(ye^o'=(^)''^a;), (18) 



in which Ni{t) is the solution of the ODE (12). 

Proof. To prove Corollary [l] note that the PDE (13) is linear. Thus, the method of 
characteristics [3l] can be employed to obtain an analytical solution (Appendix [A]). 
This yields 



Vi : ni{t,x) = 7'e-^o'=(")^"no,o(Ye^o'^M^-a;) 
which can be inserted into (11), proving Corollary [T] 



(19) 

□ 



The general solution ni{t,x) simplifies in cases of specific choices for k{t). A 
constant degradation rate yields 



Wi : ni{t,x) = ''^no^oiYe'^^x), 
while for a Gompertz decay process one obtains, 

VI : ni{t,x) = Ye '^o,o(7 e'^2^ 



X). 



(20) 



(21) 



Corollary [T] provides a solution for any label degradation rates, including those 
considered in ^321 E3] • 

By solving the decoupled PDEs analytically, the solution of the DLSP model 
can be obtained in terms of the solution of a system of ODEs. This reduces the 
complexity drastically and enables also a compact representation of the overall 
label density M{t,x): 
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Corollary 2. The overall label density ^ 

M{t,x) = Y,N^{t)n,{t,x) = ^iV,(t)f e-/o'=W'^-no,o(ye/o'=M'^-x), ^22) 

in which Ni{t) is the solution of the ODE ( [l2| ). 

Proof. By substituting ( jlsj ) into ([6]), Corollary |2] is proven. □ 

Given Corollary [l] and [2| it is apparent that merely the ODE system ( 12 ) 
has to be solved in order to compute the solution of the DLSP. This problem is 
approached in the remainder of this section. 



3.2 Calculation of the subpopulation sizes 



In order to solve ODE system (12), we note that the change of subpopulation i 



only depends on the size of subpopulation i — 1. This chain-like structure enables 
the solution of Ni{t) via recursion. By doing so, analytical solutions for the ODE 
system have been found for two cases |S1 |2I] : 

Lemma 1. Given that Vi G Nq : ai(t) = a > A Pi{t) = (5 > 0, the solution of 



(12) is: 



(23) 



This result has been derived in |2T] , where the authors studied this ODE system 
to model the number of cells that have undergone a certain number of divisions, 
without modeling label dynamics. The derivation as provided in Appendix |B] is 
generalized for later use. 

Lemma 2. Given that Vi G Nq : 0!i{t) = ttj > A /3j(t) = /3j > and \/i,j G 
No, i j : ai + (3i aj + f3j, then the solution of (12) is: 



1 = 0: No{t) 
Vi > 1 : Ni{t) 



' i 



in which 



D,(t) 



E 

j=0 



n(K+/3fc)-(«.+/3.)) 



-(a,+/3j)t 



(24) 
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Solution (24) was first stated in [5] and for completeness the proof is provided 
in Appendix [Cj It basically employs mathematical induction in the frequency do- 
main, exploiting properties of the partial fraction under the provided assumptions. 
Despite the prerequisites, this result is quite powerful as for almost all cases of time 



invariant division number dependent parameters a, and /3j the ODE system (12) 
can be solved analytically. 

In cases in which neither prerequisites for Lemma[T]nor|2]hold, then the solution 



of (12) can still be computed using numerical integration. This is possible if only 
the sizes of the first S subpopulations iV'o(t), Niit), . . ., Ns-i{t), are of interest, 
where S is finite. 

3.3 Truncation of division numbers in the population model 



In Section |3.1| a decomposition approach has been described to decouple the size 
of the subpopulations from the label distribution in the individual subpopulations. 
While this simplifies the computation of the properties of individual subpopula- 
tions drastically, the analysis of the overall label density and of the overall pop- 



ulation size still requires the calculation of an infinite sum (22). Even in cases 



for which the individual subpopulation sizes are available analytically (see (23) 



and (24)), we could not derive a closed form solution for M{t,x). Therefore, in 



this section we present a method to find an approximation of M(t, x) of the form 

5-1 5-1 

Ms{t, x) = J2 Niit, ^) = Y1 ^i(t)ni{t, x) (25) 

i=0 i=0 

with truncation index S* > 0. Instead of considering an infinite number of subpop- 
ulations, only the first S subpopulations are taken into account. While it might 
be argued that a bound S can be determined from experimental data collected 
in proliferation assays [32l 133] , this is not true for long times. In case of long 
observation intervals, the autofluorescence - which will be discussed in Section |5] 
- avoids an estimation of S. Thus, reliable selection rules for the truncation index 
S are necessary. 

In order to approximate M{t, x) with arbitrary precision by the truncated sum 
Ms{t,x), convergence of ^ and ([T]) with respect to the subpopulation index i is 
required and can be proven: 

Theorem 2. The sums (|6| converge for any finite time T, if there exist 

"sup = sup ai{t) > 0, 

i6[0,T],jeN„ 

ainf = inf ai{t) > 0, (26) 
Anf = inf A(t) > 0. 

t€[0,T],i&% 

The proof of Theorem [2] is provided in Appendix lEl It employs a system of 



ODEs of which its states are an upper bound for the states of (12), and which 
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can be solved analytically. Given these upper bounds the comparison theorem for 
series [35] can be used to verify convergence. Note that Theorem |2] is powerful as 
it holds for all biological plausible functions ai{t) and (3i{t). 

Given convergence the question arises how large the truncation index 5* must 
be to ensure a predefined error bound at a given time. For the considered system 
it can be shown that: 

Theorem 3. Given a truncation index S and a time T, as well as ainf, Osup; one? 
/3inf as defined in Theorem^ the truncation error is upper bounded by Es{T): 

5-1 ^.^^ 

(amf+ftnf)r 



\mT,x)-Ms{T,x)\U ^ „ ,„ C,„.,„r ^ (2a.„pr) 



e 



(27) 

To prove Theorem|3, we show that \\M(t,x) — M{t, x)\\i = X]i^5+i ^j(^)- This 
sum can be upper bounded for all biologically plausible functions ai{t) and f3i(t) 
using the ODE system employed to verify Theorem |2] The full proof is provided 
in Appendix |Fj Note that, if Vi G No : ai{t) = a A Pi{t) = /3, the bound ([27]) is 
precise and equality holds. 

Remark 2. In this work we considered an error bound which is relative to the 
initial condition. This is reasonable as for this system the superposition principle 
holds and the relative truncation error ^^^^^'^^\^m~(ox)\[i'^^^^^ ^^^^ independent of 
||M(0,a;)||i. 

Given Theorem|3| an upper bound 5* can be derived which ensure that a relative 
error is bounded by e: 

Corollary 3. Assuming that ai^f, Osup; O'lT'd Anf exist as defined in Theorem 
the error bound 

\\M{T,x) - Ms{T,x)\\i 

holds if 

2«..T ^(2«supT)^ 



g_.pT _ J2 ^^"^"P^ ^ e-("in,+ftnt)T < ^ ^29) 
i=0 ^' / 

Proof. Corollaryjsjfollows directly from Theoremjsj using ^^^^'''^\M(o^j\\'i^'^^^^ — ^sC^) — 
e. □ 

Despite the generality of Theorem [3] and Corollary |3] for the considered system 
class, it suffers the small disadvantage that no explicit expression for S has been 
found. Rather, the minimum truncation index S which is required to ensure a 
certain error bound has to be found iteratively by increasing or decreasing S based 
on the current error. Fortunately, this search is computationally cheap as it is not 
necessary to solve a system of ODEs or PDEs, but the error bound is available 
analytically. 
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A study of the a priori error bound (29) shows that if the acceptable relative 



error e is kept constant, the truncation index S grows monotonically as function of 
the final simulation time. This is due to the exponential growth of e^°""p-^ vs. the 
polynomial growth X]£o^ (2asupr) ^ yiQ^ely for cases in which 2q;sup < ttinf + Anf, S 
does not have to increase arbitrarily over time but stays bounded, as under these 
conditions the population dies out. Note that the increase of S is often not critical. 
Due to label dilution in general only the first seven or eight cell divisions can be 
observed [IB], which limits the timespan of interest and therefore the required 
truncation index S. 

Aside from an approximation of the population density M(t,x), also approx- 
imations for the overall population size M{t) and of the normalized overall label 
density m{t,x) may be necessary to compare model predictions to measurements. 
Accordingly to ([t]) and ([s]), plausible choices for these approximations are 

Ms{t) = I Ms{t,x)dx and ms{t,x) = (30) 
Jr+ Ms{t) 

Theorems [2] and [3] can be extended to verify convergence and determine truncation 
errors for these quantities. For Ms{t) this is straightforward, while for ms{t,x) it 
is slightly more complicated. The proofs are not provided here as this is beyond 
the scope of this work and would reduce the readability. 



To summarize, in this section the DLSP model has been analyzed in-depth. We 
have shown that for a very general class of division and death rates ai{t) and /3j(t), 
the solution of the DLSP can be computed by solving a system of ODEs. This ODE 
system has an analytical solution for a rather general class of time independent 
parameterizations. By determining rigorous error bounds, we furthermore enable 
the calculation of the required truncation index to achieve a predefined precision. 
As shown later, this will allow for many systems to predict the population response 
employing a low-dimensional ODE system. 



4 Comparison of different proliferation models 

In the last section we have analyzed the DLSP model and outlined a method to 
solve it. The question which remained open is how the DLSP model and its solution 
relate to existing population models for cell proliferation. To answer this question 
we confine ourselves to the in our opinion most common models, the exponential 
growth model (EGM), the division-structured population model (DSP) and the 
label-structured population model (LSP): 

EGM: An ODE describing the dynamics of the overall population size [T]. 

DSP: A system of ODEs describing the dynamics of the number of cells 
contained in the individual subpopulations, where the subpopulations are 
defined via a common number of cell divisions [211 H] ■ 
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LSP: A PDE describing the dynamics of the label density in the overall 
population [29|fT7ll28]. 

These models are used in many more publications than cited here and various 
extensions of these models exist. 



4.1 Relation between EGM and DLSP 

The EGM is the simplest available model which describes population dynamics. 
It has only one state variable, which corresponds to the size of the overall cell 
population. In general, the EGM is written as 

^^^^ = 0(t)M'^G^(t), M^«^(0) = M^"""^, (31) 

in which (/)(t) is the effective growth rate. A common choice is 0(t) = e'^^~'^'^^ which 
results in a Gompertz equation [1]. 

As the EGM only describes the overall population size, it is contained in the 
DLSP. By choosing ai{t) = (j){t), l3i{t) = and iVo,o = M^^^, the overall popu- 
lation size M{t) predicted by the DLSP is equivalent to M^^^{t). This can be 
shown using the time derivative of M, 

^ = E ^ = E = (32) 

which has the initial condition Mq = XlteN ^«.o — ^o"*^^- 



4.2 Relation between DSP and DLSP 

In contrast to the EGM, the DSP resolves the subpopulations, and the state vari- 
ables iYP^^(t) correspond to the number of cells which have divided i times. To 
our knowledge this model has first been proposed in [21] and its most common 



form is equal to (12). Thus, the DSP is contained in the DLSP and is obtained by 
marginalization over the label concentration x. Actually, according to Theorem [T| 
a DSP model is solved to compute the solution of the DLSP. As for the PDE com- 
ponent of the DLSP an analytical expression can be derived (Corollary [T|, solving 
the DLSP model has basically the same complexity as solving the DSP. 



4.3 Relation between LSP and DLSP 

For the comparison of model predictions and labeling experiments with CFSE or 
BrdU, the LSP model has been introduced [291 113 EH]. The state variable of 
the LSP denote the label density M^^^{t, x) in the population. In general, the 
evolution of M^^^{t,x) is modeled by the PDE 

dM^^^{t,x) d{u{x)M^^^{t,x)) _ 

Ft ^ ~ (33) 

- {a{t, x) + (3{t, x)) M^^^{t, x) + 27a(t, x)M^^^{t, -fx), 
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with initial condition M^^^{0,x) = Mq^^{x) [29j. As this model allows for label 
dependent division and death rates, a{t,x) and (3(t,x), it is in this respect more 
general than the DLSP. 

However, it is not obvious why the cell division or death rates should depend 
on the label concentration. If the experiments are performed at low label concen- 
trations far from the toxic regime, the population dynamics should be independent 
of the labeling [T6],|36]. In particular, complex dependencies of a(t,x) and /3{t,x) 
on the label concentrations x, like those shown in [29], are hard to argue. Addi- 
tionally, a recent study supports that the introduced nonlinearities are correlated 
with the division number pU] . 

Therefore, we just consider division and death rates which solely depend on 
time t, a(t) and f3{t). As proven in Appendix G, for this case, the solution M{t, x) 
of the DLSP, with ai{t) = a{t) and (3i{t) = ^(t) and Noflix) = M^^^{x), is equiv- 
alent to M^^^{t,x). This shows that under these assumptions, the information 
provided by the LSP is a subset of the information available from the DLSP. This 
renders the DLSP more useful, as also subpopulation sizes are accessible. 

Furthermore, for time dependent a{t) and P{t), the solution of the DLSP can 
be approximated by a low-dimensional ODE system (Theorem [2] and |3| . Hence, 
instead of computing M^^^{t, x) using a PDE solver as done in all available publi- 
cations, one may solve only a low-dimensional ODE system. Using the analytical 



results for the ODE system (12) even analytical solutions are available, e.g.. 



MLSP(t, x) = e-("+^)*e'=* ( ^ ^^^p^M^""^ {0, fe^'x) ) , (34) 



for constant rates a and (3. Although this result for the LSP may be helpful to 
study various systems, we have not found it in the literature yet. The reason 



might be that a direct derivation of (34) is rather complex, whereas the study of 
the DLSP renders it straightforward. 

Clearly, label dependent cell division and death rates or constant label loss rates 
were not considered here, in contrast to what was done in [291 HZl EH]- This was 



avoided as the decomposition of the solution shown in Section [371] becomes impos- 
sible and solving the DLSP model gets computationally challenging. Nevertheless, 
the loss of these degrees of freedom is compensated by allowing for biologically 
more plausible division dependent cell parameters in the DLSP. 



4.4 DLSP as a unifying modeling framework 

The implications of the findings in Section |4.1j|4.3| are that the three most preva- 
lent classes of population models are captured by the DLSP. Furthermore, it is 
more general, as label distributions and division dependent parameters may be 
considered, which are both important and well motivated from a biological point 
of view. Figure |3] illustrates the relations and shows how the EGM, the DSP, and 
the LSP may be constructed from DLSP via marginalization. 
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Figure 3: Illustration of the relation between the exponential growth model 
(EGM), the division-structured population model (DSP), the label-structured pop- 
ulation model (LSP), and the division- and label structured population model 
(DLSP). The models are distinguished using two properties, the availability of 
division numbers (vertical axis) and of information about the label distribution 
(horizontal axis). Arrows indicate whether and arrow labels describe how a model 
can be obtained from another model. It is apparent that the DLSP model is 
the most general model, as all remaining models can be constructed from it via 
marginalization. 



In contrast to the generality, the simulation effort increases only marginally 
when studying the DLSP instead of the DSP or the LSP. This is due to the 
decomposition into a system of ODEs (which is equivalent to the DSP), and a single 
set of PDEs. The set of PDEs can be solved analytically, and in several cases even 
analytical solutions for the ODE exist, facilitating an analytical solution of the 
overall system. Such analytical solutions can then be used to determine previously 



unknown analytical solutions for DSP and LSP, e.g., like (34). 



Remark 3. Obviously, there exist extensions of the LSP and the DSP which are 
not captured by the current version of the DLSP. Examples are the aforementioned 
label concentration dependent division and death rates for the LSP f2Bl \29^ as 
well as DSP models with recruitment delay [241 While the DLSP model can 
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easily be extended to take such effects into account, the numerical analysis will get 
more challenging. 

5 Computation of measured label distribution 

In the last section, we related the division- and label-structured population model 
to existing models. In this section, the prediction of the DLSP models will be 
related to data collected in proliferation assays. 

5.1 Autofluorescence and measured overall label distribu- 
tion 

As outlined in the introduction, to obtain quantitative information about the pro- 
liferation dynamics, the fluorescent levels of individual cells are assessed using flow 
cytometry [18]. The fluorescence level of an individual cell, y G M+, summarizes 
the label induced fluorescence, x, and the autofluorescence, Xa, 

y = x + Xa. (35) 

The background, which might be interpreted as measurement noise, avoids a pre- 
cise reconstruction of the label concentration. Furthermore, it limits the number 
of cell divisions which can be observed. While the label induced fluorescence, x, 
halves at cell division, this is not true for the autofluorescence. As the initial 
label concentration cannot be arbitrary high to avoid interference with the cell's 
functionality and toxicity, even for highly optimized labeling strategies only six to 
eight division can be observed before the observed fluorescence becomes indistin- 
guishable from the background fluorescence |18j . 

To address these problem a modifled label-structured population model is intro- 
duced in [30] for the case of constant background fluorescence, Xa- This modifled 
label-structured population model directly describes the evolution of y, account- 
ing for the facts that (1) only x is divided among daughter cells and (2) only x is 
degraded over time. Unfortunately, this complicates the numerical treatment - for 
this model no analytical expression for the label evolution is known - and does not 
allow for the a separate analysis of the contributions. Furthermore, experiments 
showed that the background fluorescence varies among cells [18]. The autofluo- 
rescence, also called background fluorescence, is a stochastic variable Xa ~ p{xa), 
which is independent of the level of label concentration. The distribution of Xa, 
p{xa) '■ M+ — >■ M+, with f^^p{xa)dxa = 1, cau be assessed using control experi- 
ments [IHl El ES] • 

In this section, we propose an approach to predict the measured distribution 
of fluorescence, while explicitly distinguishing label dynamics and measurement 
process. The label dynamics are described by the DLSP model and the mea- 
sured distribution of fluorescence is simply the convolution of the label induced 
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fluorescence, M(t,x), and the autofluorescence distribution, p{xa) 



M(t, x)p{y — x)dx. 



(36) 



Hence, the measured fluorescence distribution M''^{t,y), which is a number den- 



sity function, can be obtained by simulating (22) and computing the convolution 



integral (36). This is comparable to results described in [32l [33], where Xa is par- 
tially contained in the model. However, the decomposition of the computation 
of My{t,y) in dynamics and measurement is far more intuitive than a combined 
model as in |30l |32l [33] which combines the effects. 



5.2 Efficient approximation of measured overall label dis- 
tribution 

It has been shown that the overall label distribution, M{t,x), can be computed 
efficiently using the simulation of a low-dimensional ODE model and the analytical 
solution of a simple PDE. Unfortunately, this efficiency is corrupted by the need 



for solving the convolution integral (36). A repeated evaluation, as required for 



parameter estimation (see, e.g., ^30j), results in a large computational burden. 

To reduce the computational complexity, we propose an approximation for 
M'^{t,y) of M^{t,y) which can be computed without integration. To allow for 
this approximation, we assume that the initial condition is a weighted sum of 
log-normal distributions, 

J 

No,o{x) = Nofi log Ar(x|/i^o, (37) 



with fraction parameters G [0, 1], with J2j=if'^ = parameters fJ,Q,al G IR+, 
and 

1 noe(x)-^^ \2 

,x<0. 

The faction parameters, f^, determine which fraction of cells belongs to which log- 
normal distribution. The number of different log-normal distributions is denoted 
by J G N. In addition, we restrict the measurement noise to be log-normally 
distributed, p{xa) = log ^/'(xal/Xa, cx^). These two assumptions are not restrictive, 
as any smooth distribution can be approximated arbitrarily well by a sum of log- 
normal distributions and as autofluorescence levels are known to be approximately 
log-normally distributed (see, e.g., [E]). 



Given (37), it can be shown that the label distribution in the individual sub- 
population is 



N,{t,x) = N,{t)yjnogmx\fil{t),{ai)') (39) 



i=i 



18 



with n{{t) = — ilog(7) — Jq k{T)dT + /i;^ (for proof see Appendix h]). This follows 
directly from the analytical solution of ni{t,x). Thus, log-normal distributions 
are conserved under the considered class of partial differential equations, and log- 
normal initial conditions result in log-normal label distribution for t > 0. This 
implies that also the label induced fluorescence distribution is a sum of log-normal 
distributions, 

J 

M{t,x) = J2m,^) = 5^iV.(t)^/nogAr(x|/xl(t),K)2) (40) 

ieNo ieNo j=i 



By inserting this in the convolution integral (36), we obtain by linearity of inte- 
gration 

/■oo 

J poo 

= E ^(^) E / ^og^^ix\^£^it), [aif) \ogU{y - al)dx. (42) 

The individual summands of M^(t, y), Nf{t, x) = Ni(t, x) \ogM{y—x\fia, o'Ddx, 
are the measured fluorescence distributions in the subpopulations defined by a 
common division number. Therein, the summands of Nf{t,x), 

POO 

nf\t,x) = / \ogX{x\^ilit),{alf)\ogAfiy - x\^ia,cTl)dx, (43) 
Jo 

describe the contribution of the j-th log-normal distribution in the initial con- 
dition to n\{t,x). This can be traced back as the superposition principle holds. 
Apparently, the efficient assessment of My{t,y) is possible, using an efficient com- 
putational scheme for computing nf''{t,x). 

The probability density nf'-'{t,x) is the probabihty density of the sum of two 
log-normally distributed random variables. Although, this density is of interest in 
many research fields (see [23 EH] and references therein), no analytical formula for 
computing 7v('-'[t,x) is known. Still, several approximations are available. One of 
the most commonly used approximation has been proposed by Fenton [37j . Fenton 
employs the fact that although the distribution of the sum of two log-normally 
distributed random variables is not log-normal, it can still be closely approximated 
by a log-normal distribution. In [37], this approximating log-normal distribution 
is chosen to have the same first two central moments, mean E^'^(t) and variance 
Varf^(t), as the actual distribution of the sum. 

The time-dependent central moments of n\'\t,x) are the sums 

Ef(t) = E^(t) + E„ (44) 
Varf ^(t) = Var]' {t) + Var„, (45) 
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of the time-dependent central moments of the label distribution of the i-th sub- 
population, E](t) and Var](t), and the static autofluorescence, Ea and Var^, as it 
is known from basic statistics [39] • These central moments are 



E|(t) =e^^We^, (46) 

Var^t) = e^^^+^^^of (^e^<^i)' _ . (47) 

for the label distribution and 

Ea{t) = e^'^e-^, (48) 

Var,(t) = e2^"+("")' (e^"'')' - l) . (49) 



for the measurement noise. Following ^7], the log-normal distribution exhibiting 
the same overall mean and variance has parameters 

f^'it) = log(Ef (t)) - ^ log + 1 1 , (50) 



10.1^.11, (51) 



\ \ Ef^(t) 
yielding the approximation 

n^'^t, x) = \og^^{x\flfy{t), (af^(t))^) (52) 

of nf"'(t, x). Own studies revealed (not shown), that this approximation is for 
narrow distributions almost indistinguishable from the true distribution. In par- 
ticular, if one of the distribution becomes narrow, the approximation can be made 
arbitrary good. This is helpful, as the precise parameterization of the initial condi- 
tion might be a degree of freedom, which can be used to regulate the approximation 
quality. 

Given the approximation of nf'-'(t,a;), the approximation 

J 

My{t,y) = Y,N,{t)Y,r\ogN{x\f^^\t),{af{t)f). (53) 

of the measured fluorescence distribution can be computed. This approximation 
is the sum of log-normal distributions those parameters can be computed analyt- 
ically. Therefore, it merely requires the evaluation of the log-normal distribution 
at different points, which can be made fairly efficient using lookup tables. The 



approximation (53) can be determined orders of magnitude faster than the actual 



convolution integral (36) used, e.g., in [321 [33]. Apparently, this approximation 
can also be combined with the truncation introduced in the last section. 

Similar to the actual value, the approximation M^(t,|/) might be employed 
to perform parameter estimation. There, M^(t,y) is compared directly [28] or 
indirectly [IZl [29], [30] with the measured flow cytometry data. This enables the 
inference of the model parameters, for instance, proliferation and death rate. 
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Remark 4. Parameter estimation for the DLSP model is beyond the scope of this 
work. We focus on the development of modeling and simulation tools for structured 
cell population, which might in a second step he employed to infer parameters. 



6 Example: Population with division number de- 
pendent parameters 

To demonstrate the properties of the DLSP model, an illustrative simulation study 
is performed. Therefore, a hypothetical cell population system with division num- 
ber dependent proliferation rates ai is considered. The existence of division num- 
ber dependent proliferation dynamics is known for many cell systems [H [TOl [20] . 
whereas the magnitude of the effect varies between them. This example shall illus- 
trate the power of the DLSP and the proposed numerical procedure and therefore 
does not focus on a particular biological system. 

The hypothetical cell population is assumed to have an initial proliferation rate 
of a = 0.02 [1/hour], corresponding to an initial doubling time of 35 hours. This 
initial proliferation rate changes upon cell division. It is assumed that the prolifer- 
ation rate decreases exponentially, Vz : = ae~^"* [1/hours], with Aq, = 0.23 [-]. 
This rate law is based on the findings in [20J and results in a reduction of the 
proliferation rate by a factor of 2 when proceeding through 3 generations, thus 
Qfj+s = The cell death rate is set to a constant value, Wi : f3i = (3 = 0.001 
[1/hours]. Concerning the labeling, a log-normal initial label density Nofi{x) is 
assumed, as observed in many studies, e.g., [171 [2HI [22] • The label dilution factor 
and the degradation rate are set to 7 = 2 [-] and k = 0.003 [Ul/hour], respectively, 
in which UI denotes the unit of label intensity. The autofluorescence is assumed 
to be log-normally distributed with fia = 2.5 and cTq = 0.3. All parameter values 
are comparable to those available in the literature [T71 [28], [29] . 

The resulting cell population model is simulated for t e [0,6] days. The label 

density M2o(t,x) and the size M2o(t) of the overall population are depicted in 
Figure |4| Both quantities are computed using a truncation index of S" = 20, thus 
merely the first 20 subpopulations are taken into account. This already ensures a 
small truncation error. 

The actual truncation error and the bound for the truncation error are now 
studied in more detail. As no analytical solution is available, we compare all 



results to Mioo(^, a^)- For this case, the analytical expression (27) for the error 
bound yields Es{t) < 10~^° over the whole time interval t G [0, 6] days. Therefore, 
^100(^)2;) is considered as the exact solution. Given MiQQ{t,x) the truncation 



error is evaluated. From the results depicted in Figure 5(a) it is apparent that 
over the considered time interval, already S* = 11 provides an error smaller than 
10~^. This illustrates that a small number of subpopulations is sufficient to obtain 
a good approximation of the population density. This result is also supported 



by the derived truncation error bound Es{T) (Figure |5(b) ), while the expected 



truncation error is overestimated. This is visible for instance for T = 6 and 5* = 11, 
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Figure 4: Label density in cell populations at different points in time, computed 
from the first S = 20 subpopulations. In order to ensure comparability with 
common histograms plots whose bins are logarithmically distributed, the label 
density is multiplied with the label concentration x. 



where the truncation error bound is several orders of magnitude higher than the 
actual truncation error. 

To assess the truncation error bound more precisely, we compute the minimal 
truncation index S required to ensure a predefined error bound e. This analysis is 
performed using the exact truncation error (blue) and the truncation error bound 



(orange). The results are depicted in Figure [6| where Figure 6(b) shows the time 
dependency and Figure [6 (a) [ shows the error level dependency of the minimal trun- 
cation index S. As verified previously, the computation from the exact truncation 
error yields lower truncation indices. The maximal observed difference for this sys- 
tem is a factor of 2. The difference increases over time and interestingly, for this 
system, the truncation index 5* as a function of T approximates a line with a slope 
of 2.5. While the slope is problem dependent, this effect has been observed for all 
considered systems. It probably originates from the structure of the truncation 
error bound (29). Besides the time dependency, the index S depends also on e. 
When e is decreased by a factor of 10, the index S has to increase by 2. This is a 
quite reasonable scaling and allows for very good approximations. For the system 
at hand, the analysis of the exact truncation error shows that 5 = 10 ensures an 
error of 0.01 % of the original population size. Employing the truncation error 
bound we compute M{t). Thus, the truncation error is overestimated by a factor 
of 2 but this number is computed without simulation and available even if the 
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Figure 5: Truncation error (a) and truncation error bound (b) as function of the 



truncation index S and the final time T. 



exact solution is not known. Given the upper bound of the truncation error, we 
can verify a priori that a very good approximation {Es{T) < 10~^) of the solution 
of the coupled system of PDEs Q can be calculated by solving a system of 20 
ODEs. This reduces the computational effort drastically. 

Aside from the computational speed-up, the DLSP provides information about 
the overall label density and the size of the subpopulations. The former allows 
for the comparison of model prediction to labeling experiments, while the latter 
allows for the assessment of population properties like the mean division number 
(Figure It]). These quantities are of interest in many studies, in which a precise 
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Figure 6: Truncation index S required to ensure a maximal error e at a given 
time T. 
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understanding of the proliferation dynamics is of crucial importance. 

Beyond the analysis of model properties, also an comparison of model pre- 
diction and measurement data is of interest. Therefore, the measured fluores- 
cence distribution is required, which can be computed using (36) or approximated 
using (53). For the problem at hand, the true and the approximated solution 
are indistinguishable, while the approximated solution can be computed orders of 
magnitudes faster. The distribution of the measured fluorescence is depicted in 
Figure |8] Similar to [30j , this simulation results shows that after a certain number 
of cell divisions, cells with different division numbers cannot be told apart any 
more. This is mainly caused by the halving of label concentration at each cell 
division, but also by the label degradation, resulting in an increased importance 
of the cellular autofiuorescence. 



7 Conclusion 

In this work, we have proposed a division- and label-structured population model 
which provides a unifying framework to study proliferating cell populations un- 
dergoing symmetric cell division. This model is based upon own work in [31] and 
considers both, continuous label dynamics and discrete division number depen- 
dent effects, such as cell aging. The resulting model is a system of coupled PDEs, 
which, under biologically plausible assumptions, can be split up into a system of 
ODEs and a set of decoupled PDEs. Each PDF describes the label distribution 
within one particular subpopulation and the ODE model describes the number of 
cells per subpopulation. 

We have shown that the model is a generalization of existing division-structured 
population models [H [5] and label-structured population models [171 [28], |29l 130] . 
Both model classes can be derived from the proposed model via marginalization. 
In contrast to these two existing types of models, the proposed model allows to 
incorporate division number dependent parameters as well as label distributions. 
The former one is important, as division number dependent parameters are found 
in many different cell systems and often are the subject of interest, while the latter 
one allows the direct comparison of model predictions and data. This supersedes 
complex and error-prone data analysis via deconvolution or peak detection [361 IHl 

Clearly, though the model provides generalization and unification of several 
classes of population models, there remain models which are not covered. Exam- 
ples are age-structured population models [26l ETJ SOI SH HSl HS], size-structured 
population models [121 HI], and general population balance models [ISl HB]. Fur- 
thermore, the size- and scar-structured population model for the asymmetrically 
dividing budding yeast has to be mentioned [2]. There is quite a theory of popu- 
lation model construction introduced in [17] . 

For the majority of these population models no analytical solutions are avail- 
able. To study the dynamic properties of the models quantitatively, finite dif- 
ferences, finite volume, or finite elements discretization schemes are applied and 
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Figure 7: Size of the subpopulations (a) and mean division number (b) computed 



using the division- and label-structured population model. 
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Figure 8: Overall fluorescence intensity, ?/ = x + x^, in cell populations at different 
points in time, computed from the first S = 20 subpopulations. In order to 
ensure comparability with common histograms plots whose bins are logarithmically 
distributed, the label density is multiplied with the label concentration y. 

the resulting ODE system is solved numerically (see, e.g., [2Hl[2n])- This need for 
numerical PDE solvers, which usually limits the state dimension to three to to the 
curse of dimensionality, is the main drawback of most populations models. It ren- 
ders the analysis complex and partially accounts for the observed focus on steady 
state analysis [261 123 SHI IH], while dynamical aspects are mostly disregarded. 
Furthermore, an in-depth analysis of the model and its parameter has merely been 
performed for one- dimensional systems. 

Besides its generality, the DLSP model can also be simulated efficiently. We 
have proven that the solution can be approximated by a low-dimensional ODE 
system, employing truncation. For the truncation error we have derived an a 
priori bound, which can be evaluated analytically. This lower bound can serve 
to determine the minimal model order / complexity required to achieve the desired 
approximation quality. This renders our model better applicable in cases where 
many other models, e.g., |l3l[30], come at a high computational cost. Also, these 
results can be used to allow a more rigorous reexamination of studies which employ 
the DLSP model, i.e., [321 [33]. 

In order to study the computational complexity, we have analyzed a cell pop- 
ulation model with division number dependent parameters. Our study indicates 
that, if only the first eight divisions are of interest, which is the case in many stud- 
ies [18], the system of coupled PDEs can be approximated well by a system of 20 
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ODEs. The associated low computational complexity for evaluation of the model 
predictions facihtates the in-depth analysis of the population model. In particular, 
parameter estimation and uncertainty analysis becomes more efficient. Therefore, 
besides the novel biological insight which can be gained using the DLSP model, the 
developed decomposition and truncation scheme should be seen as a tool for future 
advanced estimation procedures. This is also the case for the proposed approach 
to determine the measured fluorescence distribution from the label distribution. 
The common convolution integral formulation could be employed, but the approx- 
imation employing the log-normal distribution is more efficient and yields almost 
identical results. This renders the proposed approximation a useful tool, enabling 
a more detailed study of the system. It has been shown in [501 ED [52] that re- 
formulations of the model and the objective function may allow for a significant 
speedup of the optimization. 

In subsequent studies, estimation methods and inverse problem formulations 
developed for exponential growth models [Tj , division-structured population mod- 
els [H [5], and label-structured population models [HI EHl EHl ED], have to be 
adopted to apply to the DLSP model. This is also true for methods developed for 
size-structured populations [HI |53], age-structured populations [51] and general 
PDEs [55] , for which even convergence properties have been established. Employ- 
ing parameter estimation, e.g., for the T lymphocyte data published in [T7], novel 
insights regarding division number dependencies on the population dynamics can 
be gained as indicated by [321 133] and shown by own unpublished results. In addi- 
tion, due to the improved biological interpretation of the model, these results are 
expected to be far more reliable. 
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Appendix 



A Proof of analytical solution of PDE (13) 



To determine the solution of the PDE (13) the method of characteristics [31] is 



employed, which is possible as (13) is linear. The characteristics of (13) are defined 
by the ODEs 



doc 1 / \ dt dTiij^ 1 / \ 

- = -A;(t)x, - = 1, -j^ = k{t)n. 



(54) 



with x(0) = xq, t{0) = 0, and nj(xo) = nifi{xo). This system of ODEs has the 
solution 



x(r) = xoe-fo'^'^''^, t{T) = r, n,(r) = e/o^ ^(-)'^Xo(xo). 
By substitution we obtain 

n,{t,x) = e-^o'=M^-n,,o(e-^o'=M'^-a;) 



(55) 



as solution for (13). 



(56) 
□ 



B Proof of Lemma [T]: Solution of ODE system 

In this section we prove by mathematical induction that the ODE system 



^ = 0: ^ = -{a + (3)No, 



Vi > 1 : 



dt 

dt 



(57) 



(a + P)Ni + 2aNi^i 



with initial conditions A^o(O) = A^^o.o and Vi > 1 : A^i(O) = 0, has for a,a > and 
(3 > the solution: 



2! 



(5^ 



Thereby, (57) is a generalization of (23). 



It is trivial to verify that A^o and A^^i are the solutions of (57) for 2 = and 



2 = 1, respectively. Hence, only the problem of proving that Nk+i is the solution 

(59) 



of (70) for i = k + 1 given remains. To show this, note that 

i2ay 



(58) \/ieNo:M 



{s + a + /3y+^ 



No,o, 
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in which Mi is the Laplace transform of Ni. Given this 

dNk+i 



dt 

sSfk+i 



Substitution of Mk now yields, 

A4+1 = 



-{a + 15) Nk+i + 2aNk 
Afk. 



2a 



s + a + /3 



(2d) 



k+l 



{s + a + /3)'=+2 



0,0 



(60) 



(61) 



which by applying the inverse Laplace transformation concludes the mathematical 
induction and proves Lemma [T] □ 



a 



sup; 



Remark 5. Note that for a = a = a, (58) simplifies to (23). While for a 
« = «inf, /3 = Anf , Ni = Bi, and Nq^q = Bq^, we obtain the bounding system (70) 
and its solution. 



C Proof of Lemma [2]: Solution of ODE system 

In this section we prove that if 

• Wi : ai(t) = tti A f3i(t) = (3i and 

• Vi, j ENo,i ^ j : ai + (3i^ aj + (3j 



the solution of ( 12 ) is 



i = 0:No{t)= e-("o+^o)*Aro,o 
V« > 1 : NS) = 2' (jl'^^-^^ A(t)iVo,o 



in which 



A(t) = J2 

j=0 



( 



\ fc=0 
\k+3 



I 



(62) 



It is not difficult to verify that A^^o and A^i are the solutions of (12) for i = and 
z = 1, respectively. Hence, only the problem of proving that Nk+i is the solution 
of (70) for i = k + 1 given Nk remains. To show this, note that for 

]\]=i "i-i 



(62) Vi G No : A/'i = 2* 



n;=o(^+«.+/3.)' 



0,0) 



(63) 
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in which Mi is the Laplace transform of iVj. The proof of this relation is provided 
in Appendix [Dj 

Given (63) it follows that 



k+l 



dt 

O • sAfk+l 

^ A/'fc+i 

Substitution of A4 now yields, 

A4+1 = 



(afc+1 + (3k+i) Nk+i + 2akNk 

+ A/'fc+i + SofcA/'fc 
2«fc 



s + Ofe+i + (3k+i 



a4. 



nfe+i 
,=1 "i- 



-iVon 



(64) 



(65) 



which by applying the inverse Laplace transformation concludes the mathematical 
induction and proves (58). □ 



D Derivation of Laplace transform (63) 



To derive (63), we study the partial fraction of 



n;=o(s+«.+/3.) 



(66) 



As under the prerequisite Vi, j G Nq with i ^ j : ai + Pi ^ aj + f3j all poles are 
distinct, the partial fraction can be written as 



Afii 



N 

E 

.A:=0 



Cfe 



s + ak + f3k) 



0,0- 



(67) 



To determine the coefficients Cfc, we consider the equality constraint 

1 Cfc 



k=0 



s + ak + /3fc) 



(68) 



fc=0 j=l 



As this equality constraint has to hold for all s, it must be satisfied for s 
-(ttfc + /3fc), yielding 



Cfc 



-1 



(69) 
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Given the values for Ck one can easily verify (63) by plugging in the c^'s into (67). 
Obviously, the proposed procedure can also be inverted, which concludes the 
derivation of (63). □ 



E Proof of Theorem [2]: Convergence 

To prove Theorem |2| the comparison theorem for series [35] is applied. Therefore, 
we define the bounding system 



i = 0: 

Vi > 1 : 
with initial conditions 



dB> 







dt 
dBi 

~df 



- (ttinf + Anf) -Bo, 

— (ttinf + Anf) Bi + 2asupBi^i 



(70) 



i = 0: 5o(0) = iVo,o, V? > 1 : 5,(0) = 

and ainf, ctsup, and Anf as in Theorem [2] Due to the simple structure of (70), we 
can compute the analytical solution 



(2asup^) 



(71) 



whose derivation can be found in Appendix |Bj 

The bounding system ( 70 ) is obtained from ( 12 ) by reducing the outflows out 
of and increasing the inflows into the individual subpopulations. Intuitively, as 
the initial conditions of ( 70 ) and ( 12 ) are identical and the right hand side of ( 70 ) 
is for every t G [0,T] greater or equal than the right hand side of (12), it follows 
that Bi is an upper bound for Ni, 



yte[0,T],t: Bi{t)>Ni{t). 



(72) 



This can be proven rigorously by applying Miiller's theorem [56], as shown in [57] 
for another system. 

Given (71) and (72) one can prove the convergence of XlieNo ^iit^^)- To take 
into account that a distributed process is considered (a; > 0), we study the maxi- 
mum over X and define Bi{t) := Sj(t)7*e*=*n™P = (^^^^"p^)' ^ e-(°'"f+^'"f)*e^*iVJ"P with 
:= sup^{n()^o{x)} and N^^^ := sup^{Nofi{x)}. Thus, Bi{t) is a point-wise 
upper bound of Ni(t,x). For this definition of Bi(t) it holds that 

(i) ^i,t,x>0: < Ni{t,x) < Bi{t) Vz, and 

(ii) the series 



(2asup7t) 



-(ainf+ftnf)tgfc*JYSUp 



(73) 



i=0 



is convergent for every finite t. 
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The latter one holds true as the series is simply the Taylor expansion of the ex- 
ponential e^"''"^'^*. Under conditions (i) and (ii) it follows from the comparison 
theorem for series [35] that the series XlieNo «(^) convergent in i for every 
t G [0, T] and for every x > 0. This concludes the proof. □ 



F Proof of Theorem [St Truncation error 

To prove Theorem [3| note that 

oo 

I |M(T, x)-Ms{T,x)\\, = \\J2 HTKiT, x)\\, 

i=S 

oo „ 

= J2Ni{T) / n,{T,x)dx 

i=S 

oo 



(74) 



i=S 



in which the individual lines follow from the approximation methods (25), the 



fact that all quantities are positive, and the definition of the normalized label 



intensity (13) which has unity integral for all times T > 0. The remaining term 



in the following is successively upper bounded, for which we employ the bounding 
system (70). As shown in Appendix [e| it holds that Ni(t) < Bi(t) which yields 



(2asupT)^ 



-("inf + ftnf)TjY^^_ 



(75) 



i=S i=S i=S 

By completion of the sum, this can be written as 

5-1 



(76) 



i=S 



1=0 



Thus, by exploiting that ||M(0,x)||i = Nq^, one obtains (27), which concludes the 
proof. □ 



G Proof that the solution of LSP can be con- 
structed from DLSP 

To prove that the DLSP provides the solution to the LSP, M^^^(t, x) = M{t, x), we 



show that M{t,x) = XlieN ^) solves (33). Therefore, M{t,x) is inserted 
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in the left hand side (*) of (33), yielding 

, . d 



dt 



E 



J2Niit)ni{t,x)] -k 



dx 



ieNo 



XI 



I 



\ 



'^^'^%{t,x) + N.it) ( - f^dixn.it,x)) 



dt 



dt 



dx 



\ 



V 

=0 (with ^) 



J 



In here, dNi{t)/dt is substituted with (12), resulting in 

(*) = Yl (-(aW+/3W)iVi,(t)ni(t,x)) +^2a(t)A^,_i(t)ni(t^ 

ieNo ieNo 

This is equivalent to the result if M{t,x) is inserted in the right hand side (*) 
of (33). Hence, M(t,x) = XlieNo ^*(^)'^«('^' ^) fulfills (33) which concludes the 
proof. □ 



ieNo 
-(a( 



H Proof that the PDE (13) conserves log- normal 
distributions 



To prove that the PDE ( 13 ) conserves log-normal distributions, we use its analyt- 



ical solution (22) and consider nofl{x) = \ogAf{x\fio,aQ). This yields the solution 
ni{t,x) =7*e-^o'=W'^nogAr(ye^o'=M'^-a;|/io,a2). (77) 
Employing the definition of the log-normal distribution, this equation becomes 



/, f i fn k(T)dT \ 

I logl 7 e^u ^ ' xJ—^Q 



n. 



=fe-^o'=W'=^- 



l 



27rcTo (^7*e^ 

^ f loga^-(-slog7-/(j fc(T)tiT + po) 



'2TiaQX 

for X > 0, which can be restated as 



(7J 



(79) 



ni{t,x) = log U{x\iii{t),al), (80) 

in which yUi(t) = — zlog7 — k{T)dT + /iq. As this equation also holds for a; < 0, 
it follows that the log-normal distribution is conserved and merely the parameter 
/i is time dependent. Employing the superposition principle, this statement can 
be directly extended for sums of log-normal distributions, which concludes the 
proof. □ 
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